Fast Structural Similarity Search Based on Topology String Matching
Authors
Abstract
We describe an abstract data model of protein structures that represents protein geometry with spatial data types, and we present a framework for fast structural similarity search that matches topology strings using bipartite graph matching. The system is implemented on top of the Oracle 9i spatial database management system. Performance was evaluated on 36 proteins from the Chew and Kedem data set and on a subset of PDB40. The method achieves good matching quality while executing quickly and computing the similarity search in polynomial time. This work thus shows that a pre-computed string representation of the topological properties between secondary structure elements, derived from the spatial relationships of a spatial database management system, is practical for fast structural similarity search.
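As a rough illustration of matching topology strings with bipartite graph matching, the following Python sketch may help. It is not the authors' implementation. The relation alphabet (`T`, `D`, `O`), the position-wise similarity measure, the threshold, and all function names are assumptions introduced only for this example. Each protein is modelled as a list of topology strings, one per secondary structure element. Two elements are connected by an edge when their strings are similar enough, and a maximum bipartite matching is found with augmenting paths (Kuhn's algorithm), which runs in polynomial time.

```python
from typing import Dict, List, Set


def string_similarity(a: str, b: str) -> float:
    """Fraction of aligned positions that carry the same relation code.

    Assumed measure: a simple position-wise comparison, normalised by the
    longer string. The abstract does not specify the real scoring.
    """
    if not a and not b:
        return 1.0
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / max(len(a), len(b))


def max_bipartite_matching(
    left: List[str], right: List[str], threshold: float
) -> Dict[int, int]:
    """Return a maximum matching {left_index: right_index}.

    An edge joins left[i] and right[j] when their topology strings reach
    the similarity threshold. Uses Kuhn's augmenting-path algorithm.
    """
    adjacency = [
        [j for j, r in enumerate(right) if string_similarity(l, r) >= threshold]
        for l in left
    ]
    match_of_right: Dict[int, int] = {}  # right index -> left index

    def try_augment(i: int, seen: Set[int]) -> bool:
        for j in adjacency[i]:
            if j in seen:
                continue
            seen.add(j)
            # Take j if it is free, or if its current partner can move.
            if j not in match_of_right or try_augment(match_of_right[j], seen):
                match_of_right[j] = i
                return True
        return False

    for i in range(len(left)):
        try_augment(i, set())
    return {i: j for j, i in match_of_right.items()}


def structure_similarity(
    query: List[str], target: List[str], threshold: float = 0.75
) -> float:
    """Matched elements divided by the size of the larger structure."""
    if not query or not target:
        return 0.0
    matching = max_bipartite_matching(query, target, threshold)
    return len(matching) / max(len(query), len(target))


if __name__ == "__main__":
    # Hypothetical topology strings: T = touch, D = disjoint, O = overlap.
    query = ["TDDO", "DTOD", "ODTD"]
    target = ["DTOD", "TDDO", "OOOO", "ODTT"]
    print(max_bipartite_matching(query, target, 0.75))
    print(structure_similarity(query, target))
```

Because the strings are pre-computed, a search under this sketch only needs string comparisons and one matching per candidate structure. A weighted assignment method could replace the thresholded matching if graded scores are preferred.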
Similar resources
A Method of Structure Comparison using Spatial Topological Patterns
The problem of comparison of structural similarity has been complex and computationally expensive. The first step to solve comparison of structural similarity in 3D structure databases is to develop fast methods for structural similarity. Therefore, we propose a new method of comparing structural similarity in protein structure databases by using topological patterns of proteins. In our approac...
Semantics-Sensitive Math-Similarity Search
The increasingly available electronic math contents demand a suitable search engine to help users search and retrieve. However, the unique structural syntax and the variety of semantic equivalences of mathematic expressions make it a challenge for a keyword-based text search engine to effectively meet the users’ search needs. Many existing math search solutions focus on exact search where the n...
Combining Index Structures for application-specific String Similarity Predicates
This paper presents new approaches for supporting string similarity matching based on a combination of techniques from the fields of information technology and computational linguistics to achieve better results regarding accuracy and efficiency. The homogenization of plain text reduces the volume of index structures and concurrently increases the quality of hit-lists. Furthermore it shows the ...
Robust and Fast Lyric Search based on Phonetic Confusion Matrix
This paper proposes a robust and fast lyric search method for music information retrieval. Current lyric search systems by normal text retrieval techniques are severely deteriorated in the case that the queries of lyric phrases contain incorrect parts due to mishearing and misremembering. To solve this problem, the authors apply acoustic distance, which is computed based on a confusion matrix o...
Supporting Similarity Operations Based on Approximate String Matching on the Web
Querying and integrating sources of structured data from the Web in most cases requires similarity-based concepts to deal with data level conflicts. This is due to the often erroneous and imprecise nature of the data and diverging conventions for their representation. On the other hand, Web databases offer only limited interfaces and almost no support for similarity queries. The approach presen...